 adverse event


Balancing Suspense and Surprise: Timely Decision Making with Endogenous Information Acquisition

Neural Information Processing Systems

We develop a Bayesian model for decision-making under time pressure with endogenous information acquisition. In our model, the decision-maker decides when to observe (costly) information by sampling an underlying continuous-time stochastic process (time series) that conveys information about the potential occurrence/non-occurrence of an adverse event which will terminate the decision-making process. In her attempt to predict the occurrence of the adverse event, the decision-maker follows a policy that determines when to acquire information from the time series (continuation), and when to stop acquiring information and make a final prediction (stopping). We show that the optimal policy has a rendezvous structure, i.e. a structure in which whenever a new information sample is gathered from the time series, the optimal date for acquiring the next sample becomes computable. The optimal interval between two information samples balances a trade-off between the decision maker's surprise, i.e. the drift in her posterior belief after observing new information, and suspense, i.e. the probability that the adverse event occurs in the time interval between two information samples. Moreover, we characterize the continuation and stopping regions in the decision-maker's state-space, and show that they depend not only on the decision-maker's beliefs, but also on the context, i.e. the current realization of the time series.


Health Department Will Mine Unverified Vaccine Injury Claims With New AI Tool

Mother Jones

Experts worry it will be used to further Robert F. Kennedy Jr.'s anti-vaccine agenda. The US Department of Health and Human Services (HHS) is developing a generative artificial intelligence tool to find patterns across data reported to a national vaccine monitoring database and to generate hypotheses on the negative effects of vaccines, according to an inventory released last week of all use cases the agency had for AI in 2025. The tool has not yet been deployed, according to the HHS document, and an AI inventory report from the previous year shows that it has been in development since late 2023. But experts worry that the predictions it generates could be used by HHS secretary Robert F. Kennedy Jr. to further his anti-vaccine agenda.


HHS Is Making an AI Tool to Create Hypotheses About Vaccine Injury Claims

WIRED

Experts worry Robert F. Kennedy Jr.'s Health Department will use an internal AI tool to analyze vaccine injury claims in a way that furthers his anti-vaccine agenda. The US Department of Health and Human Services is developing a generative artificial intelligence tool to find patterns across data reported to a national vaccine monitoring database and to generate hypotheses on the negative effects of vaccines, according to an inventory released last week of all use cases the agency had for AI in 2025. The tool has not yet been deployed, according to the HHS document, and an AI inventory report from the previous year shows that it has been in development since late 2023. But experts worry that the predictions it generates could be used by Health and Human Services secretary Robert F. Kennedy Jr. to further his anti-vaccine agenda. A long-standing vaccine critic, Kennedy has upended the childhood vaccination schedule in his year in office, removing several shots from a list of recommended immunizations for all children, including those for Covid-19, influenza, hepatitis A and B, meningococcal disease, rotavirus, and respiratory syncytial virus, or RSV.


A Field Guide to Deploying AI Agents in Clinical Practice

Gallifant, Jack, Kellogg, Katherine C., Butler, Matt, Centi, Amanda, Chen, Shan, Doyle, Patrick F., Dutta, Sayon, Guo, Joyce, Hadfield, Matthew J., Kim, Esther H., Kozono, David E., Aerts, Hugo JWL, Landman, Adam B., Mak, Raymond H., Mishuris, Rebecca G., Nelson, Tanna L., Savova, Guergana K., Sharon, Elad, Silverman, Benjamin C., Topaloglu, Umit, Warner, Jeremy L., Bitterman, Danielle S.

arXiv.org Artificial Intelligence

Large language models (LLMs) integrated into agent-driven workflows hold immense promise for healthcare, yet a significant gap exists between their potential and practical implementation within clinical settings. To address this, we present a practitioner-oriented field manual for deploying generative agents that use electronic health record (EHR) data. This guide is informed by our experience deploying the "irAE-Agent", an automated system to detect immune-related adverse events from clinical notes at Mass General Brigham, and by structured interviews with 21 clinicians, engineers, and informatics leaders involved in the project. Our analysis reveals a critical misalignment in clinical AI development: less than 20% of our effort was dedicated to prompt engineering and model development, while over 80% was consumed by the sociotechnical work of implementation. We distill this effort into five "heavy lifts": data integration, model validation, ensuring economic value, managing system drift, and governance. By providing actionable solutions for each of these challenges, this field manual shifts the focus from algorithmic development to the essential infrastructure and implementation work required to bridge the "valley of death" and successfully translate generative AI from pilot projects into routine clinical care.


Automated PRO-CTCAE Symptom Selection based on Prior Adverse Event Profiles

Vandenhende, Francois, Georgiou, Anna, Georgiou, Michalis, Psaras, Theodoros, Karekla, Ellie

arXiv.org Artificial Intelligence

The PRO-CTCAE is an NCI-developed patient-reported outcome system for capturing symptomatic adverse events in oncology trials. It comprises a large library drawn from the CTCAE vocabulary, and item selection for a given trial is typically guided by expected toxicity profiles from prior data. Selecting too many PRO-CTCAE items can burden patients and reduce compliance, while too few may miss important safety signals. We present an automated method to select a minimal yet comprehensive PRO-CTCAE subset based on historical safety data. Each candidate PRO-CTCAE symptom term is first mapped to its corresponding MedDRA Preferred Terms (PTs), which are then encoded into Safeterm, a high-dimensional semantic space capturing clinical and contextual diversity in MedDRA terminology. We score each candidate PRO item for relevance to the historical list of adverse event PTs and combine relevance and incidence into a utility function. Spectral analysis is then applied to the combined utility and diversity matrix to identify an orthogonal set of medical concepts that balances relevance and diversity. Symptoms are rank-ordered by importance, and a cut-off is suggested based on the explained information. The tool is implemented as part of the Safeterm trial-safety app. We evaluate its performance using simulations and oncology case studies in which PRO-CTCAE was employed. This automated approach can streamline PRO-CTCAE design by leveraging MedDRA semantics and historical data, providing an objective and reproducible method to balance signal coverage against patient burden.


Lightweight Sequential Transformers for Blood Glucose Level Prediction in Type-1 Diabetes

Barbato, Mirko Paolo, Rigamonti, Giorgia, Marelli, Davide, Napoletano, Paolo

arXiv.org Artificial Intelligence

Type 1 Diabetes (T1D) affects millions worldwide, requiring continuous monitoring to prevent severe hypo- and hyperglycemic events. While continuous glucose monitoring has improved blood glucose management, deploying predictive models on wearable devices remains challenging due to computational and memory constraints. To address this, we propose a novel Lightweight Sequential Transformer model designed for blood glucose prediction in T1D. The model is optimized for deployment on resource-constrained edge devices and incorporates a balanced loss function to handle the inherent data imbalance in hypo- and hyperglycemic events. Experiments on two benchmark datasets, OhioT1DM and DiaTrend, demonstrate that the proposed model outperforms state-of-the-art methods in predicting glucose levels and detecting adverse events. This work fills the gap between high-performance modeling and practical deployment, providing a reliable and efficient T1D management solution. Type 1 Diabetes (T1D) [1] is a chronic autoimmune condition requiring lifelong blood glucose concentration (BGC) monitoring to prevent life-threatening complications such as hypoglycemia (BGC below 70 mg/dL [2]) and hyperglycemia (BGC above 180 mg/dL [3]).
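The thresholds quoted in this abstract, and the class imbalance it mentions, can be illustrated with a minimal Python sketch. The 70 and 180 mg/dL cut-offs come from the text. The function names, the sample readings, and the inverse-frequency weighting are assumptions made for illustration: the abstract does not say how the authors' balanced loss function is defined.

```python
from collections import Counter

HYPO_MG_DL = 70.0    # hypoglycemia: BGC below this value (from the abstract)
HYPER_MG_DL = 180.0  # hyperglycemia: BGC above this value (from the abstract)


def glycemic_state(bgc_mg_dl: float) -> str:
    """Label one blood glucose reading using the two quoted thresholds."""
    if bgc_mg_dl < HYPO_MG_DL:
        return "hypo"
    if bgc_mg_dl > HYPER_MG_DL:
        return "hyper"
    return "normal"


def inverse_frequency_weights(readings):
    """Hypothetical balancing scheme: weight each class by the inverse of
    its frequency, so rare hypo/hyper readings count more in a loss."""
    counts = Counter(glycemic_state(r) for r in readings)
    total = sum(counts.values())
    return {state: total / (len(counts) * n) for state, n in counts.items()}


# Made-up readings in mg/dL: most are in range, as is typical of CGM data.
readings = [65.0, 110.0, 120.0, 135.0, 150.0, 95.0, 210.0, 100.0]
weights = inverse_frequency_weights(readings)
for state, weight in sorted(weights.items()):
    print(f"{state}: {weight:.3f}")
```

With these made-up readings, the single hypo and hyper samples each get a larger weight than the six in-range samples, which is the general idea behind reweighting a loss for imbalanced events.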


Knowledge-based Graphical Method for Safety Signal Detection in Clinical Trials

Vandenhende, Francois, Georgiou, Anna, Georgiou, Michalis, Psaras, Theodoros, Karekla, Ellie, Hadjicosta, Elena

arXiv.org Artificial Intelligence

We present a graphical, knowledge-based method for reviewing treatment-emergent adverse events (AEs) in clinical trials. The approach enhances MedDRA by adding a hidden medical knowledge layer (Safeterm) that captures semantic relationships between terms in a 2-D map. Using this layer, AE Preferred Terms can be regrouped automatically into similarity clusters, and their association to the trial disease may be quantified. The Safeterm map is available online and connected to aggregated AE incidence tables from ClinicalTrials.gov. For signal detection, we compute treatment-specific disproportionality metrics using shrinkage incidence ratios. Cluster-level EBGM values are then derived through precision-weighted aggregation. Two visual outputs support interpretation: a semantic map showing AE incidence and an expectedness-versus-disproportionality plot for rapid signal detection. Applied to three legacy trials, the automated method clearly recovers all expected safety signals. Overall, augmenting MedDRA with a medical knowledge layer improves clarity, efficiency, and accuracy in AE interpretation for clinical trials.
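The abstract mentions precision-weighted aggregation of term-level metrics into cluster-level values without giving the formula. The sketch below shows generic inverse-variance weighting on the log scale as one plausible reading. It is not the authors' EBGM computation, and the function name, inputs, and example numbers are all hypothetical.

```python
import math


def precision_weighted_mean(estimates, variances):
    """Combine estimates using weights equal to 1/variance (the precision).

    Returns the pooled estimate and its variance. Inputs are assumed to be
    log-scale disproportionality estimates for terms in one cluster.
    """
    if len(estimates) != len(variances) or not estimates:
        raise ValueError("need equally sized, non-empty inputs")
    weights = [1.0 / v for v in variances]
    total_precision = sum(weights)
    pooled = sum(w * e for w, e in zip(weights, estimates)) / total_precision
    return pooled, 1.0 / total_precision


# Two hypothetical Preferred Terms in one cluster, with ratios 2 and 4.
log_ratios = [math.log(2.0), math.log(4.0)]
log_variances = [0.5, 0.5]
pooled_log, pooled_var = precision_weighted_mean(log_ratios, log_variances)
cluster_ratio = math.exp(pooled_log)  # back-transform to the ratio scale
print(f"cluster ratio: {cluster_ratio:.3f}, variance: {pooled_var:.2f}")
```

When the variances are equal, as here, the pooled value reduces to the geometric mean of the term-level ratios. Terms with smaller variance would otherwise pull the cluster value toward themselves.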


Balancing Suspense and Surprise: Timely Decision Making with Endogenous Information Acquisition

Ahmed M. Alaa, Mihaela Van Der Schaar

Neural Information Processing Systems

We develop a Bayesian model for decision-making under time pressure with endogenous information acquisition. In our model, the decision-maker decides when to observe (costly) information by sampling an underlying continuous-time stochastic process (time series) that conveys information about the potential occurrence/non-occurrence of an adverse event which will terminate the decision-making process. In her attempt to predict the occurrence of the adverse event, the decision-maker follows a policy that determines when to acquire information from the time series (continuation), and when to stop acquiring information and make a final prediction (stopping). We show that the optimal policy has a "rendezvous" structure, i.e. a structure in which whenever a new information sample is gathered from the time series, the optimal "date" for acquiring the next sample becomes computable. The optimal interval between two information samples balances a trade-off between the decision maker's "surprise", i.e. the drift in her posterior belief after observing new information, and "suspense", i.e. the probability that the adverse event occurs in the time interval between two information samples. Moreover, we characterize the continuation and stopping regions in the decision-maker's state-space, and show that they depend not only on the decision-maker's beliefs, but also on the "context", i.e. the current realization of the time series.


Robust or Suggestible? Exploring Non-Clinical Induction in LLM Drug-Safety Decisions

Liu, Siying, Zhang, Shisheng, Bala, Indu

arXiv.org Artificial Intelligence

Large language models (LLMs) are increasingly applied in biomedical domains, yet their reliability in drug-safety prediction remains underexplored. In this work, we investigate whether LLMs incorporate socio-demographic information into adverse event (AE) predictions, despite such attributes being clinically irrelevant. Using structured data from the United States Food and Drug Administration Adverse Event Reporting System (FAERS) and a persona-based evaluation framework, we assess two state-of-the-art models, ChatGPT-4o and Bio-Medical-Llama-3.8B, across diverse personas defined by education, marital status, employment, insurance, language, housing stability, and religion. We further evaluate performance across three user roles (general practitioner, specialist, patient) to reflect real-world deployment scenarios where commercial systems often differentiate access by user type. Our results reveal systematic disparities in AE prediction accuracy. Disadvantaged groups (e.g., low education, unstable housing) were frequently assigned higher predicted AE likelihoods than more privileged groups (e.g., postgraduate-educated, privately insured). Beyond outcome disparities, we identify two distinct modes of bias: explicit bias, where incorrect predictions directly reference persona attributes in reasoning traces, and implicit bias, where predictions are inconsistent, yet personas are not explicitly mentioned. These findings expose critical risks in applying LLMs to pharmacovigilance and highlight the urgent need for fairness-aware evaluation protocols and mitigation strategies before clinical deployment.